ABSTRACT

Sequence-specific nucleases represent valuable 
tools for precision genome engineering. 
Traditionally, zinc-finger nucleases (ZFNs) and 
meganucleases have been used to specifically edit 
complex genomes. Recently, the DNA binding 
domains of transcription activator-like effectors 
(TALEs) from the bacterial pathogen Xanthomonas 
have been harnessed to direct nuclease domains to 
desired genomic loci. In this study, we tested a 
panel of truncation variants based on the TALE 
protein AvrBs4 to identify TALE nucleases 
(TALENs) with high DNA cleavage activity. The 
most favorable parameters for efficient DNA 
cleavage were determined in vitro and in cellular 
reporter assays. TALENs were designed to disrupt 
an EGFP marker gene and the human loci CCR5 and 
IL2RG. Gene editing was achieved in up to 45% of 
transfected cells. A side-by-side comparison with 
ZFNs showed similar gene disruption activities by 
TALENs but significantly reduced nuclease- 
associated cytotoxicities. Moreover, the CCR5- 
specific TALEN revealed only minimal off-target 
activity at the CCR2 locus as compared to the cor- 
responding ZFN, suggesting that the TALEN 
platform enables the design of nucleases with 
single-nucleotide specificity. The combination of 
high nuclease activity with reduced cytotoxicity 
and the simple design process marks TALENs as a 
key technology platform for targeted modifications 
of complex genomes. 



INTRODUCTION 

Designer nucleases have developed into invaluable tools to 
modify the genomes of complex organisms. By inserting a 
DNA double-strand break (DSB) into the target locus 
such nucleases activate DNA repair, which can be harnessed
to knock out genes or to promote gene targeting
(1,2). Because the DNA damage response is highly 
conserved in eukaryotic cells, the concept of DSB-based 
genome engineering is easily transferrable between highly 
diverse organisms. Accordingly, designer nuclease-based 
genome engineering has been successfully established in 
more than 10 model organisms thus far, including plants 
(3,4), invertebrates (5), fish (6,7) and mammals (8). 
Moreover, the genome of multipotent and pluripotent 
human stem cells has been efficiently modified (9-12), 
without affecting the differentiation potential of these 
cells. 

Zinc-finger nucleases (ZFNs) comprise the most suc- 
cessful class of engineered nucleases to date. ZFNs 
consist of two functional domains: a customized zinc- 
finger array fused to the non-specific endonuclease 
domain of the well-characterized restriction enzyme 
FokI. Upon dimerization of the two ZFN subunits in
correct spacing and orientation, the nuclease domain 
cuts the target DNA within the spacer sequence that sep- 
arates the two target half-sites (13). In a first approxima- 
tion, each zinc-finger within a tandem array recognizes 
three bases at the DNA level (14). However, target site 
overlap and crosstalk between individual fingers in a 
zinc-finger array considerably complicate the production 
of sequence-specific ZFNs (15), requiring the use of 
labor-intensive selection procedures to generate zinc-finger
arrays with sufficient affinity and specificity (1,16,17). Novel 
platforms, such as context-dependent assembly (CoDA), 



*To whom correspondence should be addressed. Tel: +49 511 532 5170; Fax: +49 511 532 5121; Email: cathomen.toni@mh-hannover.de
© The Author(s) 2011. Published by Oxford University Press. 

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/ 
by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. 



9284 Nucleic Acids Research, 2011, Vol. 39, No. 21 



have simplified the design process but the quality of such 
ZFNs may not be sufficient for therapeutic applications 
(18). Furthermore, although modifications in the FokI
cleavage domain have been shown to prevent 
homodimerization of the individual ZFN subunits 
(19-22), the genome-wide specificity of ZFNs is still 
under scrutiny. 

The ease of the design process as well as the balance 
between nuclease activity and associated toxicity are key 
parameters in the application of any type of designer 
nuclease (2). Two recent studies identified a novel 
protein scaffold based on transcription activator-like 
effector (TALE) proteins isolated from plant pathogens 
of the Xanthomonas genus to be amenable for engineering 
of customized DNA binding domains (23,24). TALEs are 
modular proteins composed of an N-terminal transloca- 
tion domain, central repeats that collectively mediate 
sequence-specific DNA binding, and a C-terminal 
segment that encompasses nuclear localization signals 
(NLS) and a transcriptional activation domain (25). The 
central TALE DNA binding domain contains a variable 
number (characteristically between 12 and 30) of 
conserved repeats of 33-35 residues arranged in
tandem arrays. Polymorphisms between repeats are pre- 
dominantly found in positions 12 and 13, also referred to 
as the repeat variable di-residues (RVDs), and RVDs that 
preferentially recognize one of the four bases in the target 
site have been defined (23,24,26,27). Hence, this 'one 
repeat to one base' code enables the prediction of the 
DNA binding sites of natural TALEs or, vice versa, the 
engineering of customized TALE repeat arrays that rec- 
ognize a user-defined target sequence. As a result, TALE 
repeat arrays have attracted great interest as a DNA tar- 
geting tool in the context of designer TALE-type tran- 
scription factors (dTALEs) (26,28) and TALE nucleases 
(TALENs) (27,29-34). 
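
The 'one repeat to one base' code described above can be sketched as a simple lookup. The RVD-to-base assignments below (NI for A, HD for C, NG for T, NN for G) are the commonly cited preferences and are an assumption here, since this text does not list them; the function names are hypothetical.

```python
# Commonly cited RVD-to-base preferences. Assumption: these assignments
# are not listed in the text above and are used here for illustration.
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",  # NN is also reported to tolerate A
}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def predict_target(rvds):
    """Predict the DNA sequence bound by an ordered list of RVDs.

    RVDs missing from the table are reported as 'N' (unknown base).
    """
    return "".join(RVD_TO_BASE.get(rvd, "N") for rvd in rvds)


def design_rvds(target):
    """Return one RVD per base for a user-defined target sequence."""
    return [BASE_TO_RVD[base] for base in target.upper()]
```

Reading the table in one direction predicts the binding site of a natural TALE; reading it in the other direction yields the repeat order for a customized array.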

Here, we have characterized the cleavage parameters for 
efficient TALEN-mediated genome editing in human cells. 
Moreover, we performed a side-by-side comparison 
between engineered TALENs and well-characterized 
ZFNs at two endogenous human loci, CCR5 and 
IL2RG. We show that our designer TALENs can be as 
effective as ZFNs in terms of genome editing activity but 
significantly less cytotoxic. Moreover, our results indicate 
that the TALEN platform enables the design of nucleases 
with single-nucleotide specificity. Given both the ease with 
which TALENs can be engineered and their superior 
toxicity profile, TALENs are likely to have a significant 
impact on targeted genome engineering in the context of 
applied as well as basic biology. 

MATERIALS AND METHODS 

Plasmids 

All TALE derivatives were generated using standard 
cloning procedures. The AvrBs4 and AvrBs3 deletion
variants were generated by subcloning a BamHI-BamHI
(A4-BB, A3-BB), Eco147I-HincII (A4-EH), NarI-HincII
(A4-NH, A3-NH), Eco147I-BclI (A4-EC) and NarI-BclI
(A4-NC) fragment of plasmids pENTR-D-avrBs3 and



pENTR-D-avrBs4 (26), respectively, into vectors 
pRK5.AD or pRK5.N (35). Where indicated, TALENs 
with the obligate heterodimeric KV/EA FokI variants
were used (19). All engineered TALEs were subsequently
cloned into the A4-NH and A4-NC scaffolds. The sequences
of all TALEs are indicated in Supplementary
Figure S1. The luciferase-based reporter plasmid
(pGLtk.EBE_AvrBs4.Luc) is based on plasmid pGLtk (35)
and was generated by inserting a tandem repeat of EBE_AvrBs4.
The templates for the in vitro cleavage assays were
generated by subcloning an inverted repeat of EBE_AvrBs4
separated by variable spacers (from 6-16 bp;
Supplementary Table S1) into plasmid
pCMV.LacZSGFP (16). The dsEGFP reporter constructs
used in the episomal gene disruption assay were generated
by cloning homodimeric EBE_AvrBs4/AvrBs4 or heterodimeric
EBE_AvrBs4/AvrBs3 elements (Supplementary Table S1),
respectively, between the ATG and the 5'-end of a
destabilized Enhanced Green Fluorescent Protein 
(dsEGFP) gene into plasmid pLV.CMV.dsEGFP. 
Reporter plasmid pLV.CMV.IL2RG-dsEGFP was 
generated by cloning the IL2RG gene derived from 
plasmid pRRL.MP.IL2RGpre (kindly provided by Axel 
Schambach, Hannover Medical School) into 
pLV.CMV.dsEGFP. All ZFN expression vectors were 
generated by subcloning a synthesized DNA-binding 
domain (GeneArt, Regensburg) into the pRK5.N (35) 
vector backbone, which encodes an N-terminal HA tag 
followed by a nuclear localization domain, and either of 
the obligate heterodimeric FokI variants KV/EA (19). The
target sites and the recognition α-helices of the EGFP (17),
CCR5 (9) and IL2RG-specific (36) ZFNs have been
described. The complete sequences and maps of all 
plasmids can be obtained upon request. 

In vitro cleavage assay 

In vitro cleavage assays were performed essentially as previously
described (21). Briefly, TALENs were expressed
in vitro using the TnT SP6 Quick Coupled
Transcription/Translation System (Promega). The ~1-kb
target DNA fragment was generated by Polymerase Chain
Reaction (PCR) with Phusion polymerase (Finnzymes),
primers #78 and #77 (Supplementary Table S2), and
one of the six different plasmids pCMV.LacZ-X-9GFP
(X denoting spacer length) as a template. For in vitro
cleavage, 1 µl of each TnT lysate containing one TALEN
subunit was mixed with 200 ng of the DNA template and
1 µg of BSA in NEBuffer 4 (New England Biolabs) supplemented
with 100 mM NaCl in a total volume of 10 µl.
After incubation for 90 min at 37°C the reaction was
analyzed on a 1.2% agarose gel.

Transcriptional reporter assay 

All cells were cultured in Dulbecco's modified Eagle's 
medium supplemented with 10% Fetal Bovine Serum 
(FBS) and penicillin/streptomycin (Invitrogen). 
HEK293T cells were seeded in 24-well plates at a density 
of 80 000 cells/well. After 24 h, cells were transfected in 
duplicate using polyethylenimine (PEI) as described before
(21). Transfection cocktails included 80 ng of reporter
plasmid pGLtk.EBE_AvrBs4.Luc, 400 ng of dTALE-encoding
plasmids and 10 ng of pRL (Promega) coding
for Renilla luciferase to normalize for transfection efficiency.
The amount of DNA was kept constant by
adding pUC118 to 1.2 µg. Cells were harvested 48 h after
transfection in 1× PLB lysis buffer (Promega). Firefly and
Renilla luciferase activities were measured in a
luminometer (Berthold Technologies, Bad Wildbad,
Germany) using Dual-Luciferase Reporter Assay System 
(Promega) following the manufacturer's instructions. 
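
The normalization described above (firefly signal corrected for transfection efficiency via Renilla, then expressed relative to a control) can be sketched as follows. This is a minimal illustration that assumes a simple ratio-of-ratios; the function name and argument names are hypothetical and the exact calculation used for the figures is not stated in the text.

```python
def relative_luciferase(firefly, renilla, firefly_ctrl, renilla_ctrl):
    """Firefly/Renilla ratio of a sample relative to a control transfection.

    Assumption: activation is reported as (firefly/Renilla of the sample)
    divided by (firefly/Renilla of the empty-vector control).
    """
    if renilla <= 0 or renilla_ctrl <= 0 or firefly_ctrl <= 0:
        raise ValueError("Renilla and control readings must be positive")
    sample_ratio = firefly / renilla
    control_ratio = firefly_ctrl / renilla_ctrl
    return sample_ratio / control_ratio
```

A value of 1.0 then means no activation over the control, independent of how many cells took up DNA in either well.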

Gene disruption and quantitative cell toxicity assays 

For episomal gene disruption, 80 000 HEK293T cells were 
seeded per well of a 24-well plate. After 24 h, cells were 
PEI transfected with 50 ng of pLV.CMV.IL2RG-dsGFP 
reporter plasmid, 400 ng of nuclease (TALEN, ZFN, 
I-SceI) expression plasmid, 50 ng of an mCherry expression
vector (kindly provided by Roger Y. Tsien, UC
San Diego) to normalize for transfection efficiency,
and pUC118 to 1.2 µg. For chromosomal gene
disruption, dsEGFP reporter cells were generated by
lentiviral transduction (LV.CMV.EBE_AvrBs4xAvrBs3.dsGFP;
Supplementary Table S1) with a vector dose
that rendered <1% of cells resistant to geneticin-sulfate
(0.4 mg/ml), thereby ensuring that cells contained a single-copy
target locus (16). Reporter cells were seeded in 24-well
plates (80 000 cells/well) and transfected after 24 h with
400 ng (or 1-600 ng) of nuclease expression plasmids,
100 ng of an mCherry expression vector, and pUC118 to
1.2 µg. After 2 and 5 days, the fractions of mCherry-
positive and EGFP-negative cells were determined by 
flow cytometry (FACSCalibur; BD Biosciences). The cell 
survival rate was calculated as the decrease in the number 
of mCherry-positive cells from Days 2 to 5, normalized to 
cells transfected with a non-functional nuclease expression 
vector (37). 
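
The survival calculation described above can be written out as a short sketch. It assumes that the 'decrease' is expressed as the Day 5 to Day 2 ratio of mCherry-positive cells and that normalization means dividing by the same ratio for the non-functional nuclease control; the function and argument names are hypothetical.

```python
def cell_survival(mcherry_day2, mcherry_day5, ctrl_day2, ctrl_day5):
    """Relative cell survival in percent.

    Inputs are fractions (or percentages) of mCherry-positive cells.
    Assumption: survival is the Day 5 / Day 2 ratio of the sample divided
    by the same ratio for the non-functional nuclease control.
    """
    if mcherry_day2 <= 0 or ctrl_day2 <= 0 or ctrl_day5 <= 0:
        raise ValueError("Day 2 and control values must be positive")
    retained = mcherry_day5 / mcherry_day2
    retained_ctrl = ctrl_day5 / ctrl_day2
    return 100.0 * retained / retained_ctrl
```

Under this definition a nuclease that causes no additional loss of transfected cells scores 100%, and any dilution of the marker plasmid over time cancels out through the control.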

Genotyping 

Genomic DNA of transfected cells was extracted using 
QIAamp DNA mini kit (Qiagen). The genomic regions encompassing
the nuclease target sites in dsEGFP or the
human CCR2, CCR5 and IL2RG loci, respectively, were
PCR amplified (Supplementary Table S2), and amplicons
cleaned up with QIAquick PCR Purification Kit (Qiagen).
The DNA fragments were then subjected to digestion with
either XhoI or the mismatch-sensitive T7 endonuclease I
(T7E1; New England BioLabs). For the T7E1 assay, DNA
was denatured at 95°C for 5 min, slowly cooled down to
room temperature to allow for formation of heteroduplex
DNA, treated with 5 U of T7E1 for 15 min at 37°C, and
then analyzed by 2% agarose gel electrophoresis.
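
The text does not state how gel band intensities were converted into percentages of modified alleles. One commonly used estimate for mismatch-cleavage assays, given here purely as an assumption, corrects the cleaved fraction for the fact that only heteroduplexes are cut: modified alleles (%) = 100 × (1 − sqrt(1 − fraction cleaved)). The function name is hypothetical.

```python
import math


def percent_modified(uncut, cut_1, cut_2):
    """Estimate the percentage of modified alleles from a T7E1 digest.

    uncut, cut_1 and cut_2 are background-corrected band intensities of
    the parental amplicon and the two cleavage products.
    Assumption: the widely used estimate 1 - sqrt(1 - fraction cleaved),
    which is not stated in the text above.
    """
    total = uncut + cut_1 + cut_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = (cut_1 + cut_2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))
```

For example, if 36% of the signal sits in the two cleavage bands, this estimate gives about 20% modified alleles.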

Immunoblotting 

Western blots were performed as described before (21). 
TALEN or β-actin were detected with anti-HA tag
(1:2000; Novus Biologicals) or anti-β-actin (1:2000; Cell
Signaling) antibodies, respectively, and visualized with 
HRP-conjugated anti-rabbit antibody (Dianova) and 
West Pico Chemiluminescence substrate (Thermo 
Scientific). 



Statistical analysis 

All data sets shown as bar graphs represent the average of 
at least three independent experiments. Error bars indicate 
standard error of mean (SEM). Statistical significance was 
determined using a two-tailed, homoscedastic Student's 
t-test.
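
As an illustration of the statistics named above, the following sketch computes the SEM and the test statistic of a homoscedastic (pooled-variance) Student's t-test using only the Python standard library. It returns the t value and degrees of freedom; the two-tailed P-value would then be read from the t distribution, which is left out here. The function names are hypothetical.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def pooled_t(a, b):
    """t statistic and degrees of freedom for an equal-variance t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1.0 / na + 1.0 / nb)
    )
    return t, df
```

With three independent experiments per condition, as stated above, the test has four degrees of freedom.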

RESULTS 

Minimal DNA binding domain of TALE proteins 

The biological function of many Xanthomonas TALE 
proteins has been studied extensively and a recent report 
identified a minimal domain required for efficient DNA 
binding of the TALE protein Hax3 in human cells (28). In 
order to assess whether these findings can be transferred to 
the TALE proteins AvrBs3 and AvrBs4, which were used 
as the design scaffold here, we created corresponding
N- and C-terminal deletion variants (Figure 1a) and fused
them to the transcriptional activation domain of the VP16
protein of the herpes simplex virus (35) (Figure 1b). All
variants contained an N-terminal NLS and a hemagglutinin
(HA) tag, which allowed us to monitor the expression
levels of these dTALEs in HEK293T cells by western
blotting (Figure 1c). While the dTALE variants with
extended truncations were expressed to high levels, the
larger variants A3-BB and A4-BB, respectively, revealed
lower steady-state levels. Note that the two-letter code before
the hyphen in the nomenclature of the TALE variants
refers to specificity: e.g. 'A4' in 'A4-BB' defines a TALE
domain that recognizes the predicted 19-bp binding
element of AvrBs4. The two letters following the hyphen
denote the TALE truncation variant, e.g. 'BB' in 'A4-BB'
stands for the BamHI-BamHI fragment, as depicted in
Figure 1a. The ability of the deletion variants to mediate
binding to the predicted 19-bp binding element of AvrBs4
(EBE_AvrBs4) was determined in a luciferase-based reporter
assay (Figure 1b). While the deletion variants A4-BB,
A4-NH and A4-NC activated the luciferase reporter construct
containing the EBE_AvrBs4 sequence motif, reporter
gene activation by A4-EH and A4-EC was indistinguishable
from the control. Moreover, the AvrBs3-based
variants A3-BB and A3-NH did not activate the
reporter with the EBE_AvrBs4 motif, indicating that activation
of the reporter gene by the TALE-type transcription
factors was mediated by a specific interaction of the
AvrBs4 DNA binding domain with the matching
EBE_AvrBs4 sequence motif. Furthermore, these data
confirm that a region encompassing more than 100 
residues at the N-terminus of the TALE repeat array is 
essential for efficient binding of dTALE proteins to 
matching DNA target sequences. In contrast, the 
residues located C-terminally with respect to the TALE 
repeat domain seem to be dispensable for dTALE-DNA 
interaction. 

Requirements for efficient DNA cleavage by TALENs 

To generate TALE-based nucleases (TALENs), the VP16 
transcriptional activation domain was replaced with 
the catalytic domain of the FokI endonuclease (38).

Figure 1. Minimal DNA binding domain of the AvrBs4 TALE.
(a) Schematic of AvrBs4 and AvrBs3 deletion variants. The black rectangles
represent the 17.5 central tandem repeat arrays that mediate
DNA recognition. The C-terminal nuclear localization signals (NLS)
and transcriptional activation domain (AD) are highlighted in gray.
Deletion variants are generated using the restriction sites indicated
with letters (B: BamHI; N: NarI; E: Eco147I; H: HincII; C: BclI)
and reported in the respective names (right). The number of remaining
residues at N- and C-termini (relative to the position of the DNA
binding domain) and the RVDs with the expected target sequences
for AvrBs3 and AvrBs4 are shown. (b) Transcriptional reporter
assay. Designer TALE-based transcription factors (dTALEs) consist
of the VP16 transcriptional AD fused to the AvrBs4 or AvrBs3
deletion variants. HEK293T cells were transfected with dTALE expression
plasmids and the reporter construct containing two upstream
binding sites for AvrBs4 (AvrBs4), followed by a minimal promoter
element and the luciferase gene. The graph displays luciferase activity
normalized for transfection efficiency and relative to transfection with
empty vector (-). Significant activation above background is indicated
by *P<0.05 and **P<0.01. (c) dTALE expression levels. Transfected
HEK293T cells were harvested after 48 h and cell lysates probed either
with antibodies against HA tag or β-actin. Lysate of non-transfected
cells is marked with



The parameters for TALEN-mediated cleavage were ini- 
tially determined in an in vitro cleavage assay. A linear 
DNA fragment containing an inverted repeat of 
EBE_AvrBs4 separated by spacers ranging from 6 to 16 bp
was incubated with the in vitro translated TALEN
variants (Figure 2a). A recognition site for the
meganuclease I-SceI (39) was included in all target
DNAs as an internal control. In agreement with the transcriptional
reporter assay, TALEN variants A4-NH and
A4-NC were able to bind the DNA target and induce
cleavage, while A4-EH and A4-EC were not. Highest
activity was observed for variant A4-NH at spacers of 6,
12 and 16 bp. Variant A4-NC, which harbors a longer
protein linker between the TALE repeat units and the
FokI catalytic domain, showed reduced activity.
Although variant A4-BB was expected to cleave substrates
containing the EBE_AvrBs4 motif based on the luciferase
reporter assay (Figure 1b), it did not display any
cleavage activity. None of the TALEN variants induced
a DSB in DNA substrates with a single EBE_AvrBs4, confirming
that TALEN-mediated in vitro cleavage was
sequence specific and mediated by dimerization of two 
TALEN subunits at the specific target site. 

To verify these results in cells, a fast quantitative 
reporter assay was developed (Figure 2b). An inverted 
EBE_AvrBs4 repeat separated by spacers ranging from 6 to
27 bp was cloned in between the translational start codon
ATG and the open reading frame (ORF) encoding a
destabilized EGFP (dsEGFP). A recognition site for
I-SceI was included as an internal control. Nuclease-mediated
cleavage will either lead to rapid degradation
of the episomal target plasmid or, alternatively, to 
error-prone DNA repair by the non-homologous 
end-joining (NHEJ) pathway. The resulting reduction in 
dsEGFP fluorescence intensity is hence a measure for 
nuclease activity. In good agreement with the in vitro 
data, expression of TALEN variants A4-NH and 
A4-NC led to a considerable reduction of the mean fluor- 
escence intensity (MFI) in HEK293T cells transfected with 
reporters that contain target sites separated by 12 and 
15 bp, respectively, but not at other spacer lengths. Note 
that some TALEN variants reduced EGFP expression 
from the reporter containing a single AvrBs4 recognition 
site (single EBE; black bars), suggesting that efficient
binding of these TALENs to the recognition sequence
was sufficient to reduce the MFI by up to 20% by
interfering with transcription. A similar observation was
made when expressing a mutated I-SceI (data not shown).
Interestingly, while the A4-NH variant had a restricted 
activity profile with a pronounced peak at a 12-bp 
spacer, A4-NC was less selective in terms of spacer 
length requirements and showed approximately equal 
activity on spacers from 12 to 21 bp (Figure 2c). Variant 
A4-BB displayed notable cleavage activity over back- 
ground only on targets with 21 and 27-bp spacers, while 
expression of I-SceI reduced dsEGFP expression from all
reporter plasmids. These results support the assumption 
that variants with longer linkers need longer spacers to 
accommodate the additional amount of protein. 
Assessment of TALEN expression (Figure 2d) consistently
revealed reduced steady-state levels of the less active
TALENs A4-EH and A4-EC, which may partially, but
not exclusively, explain the lack of activity of these
variants.

Figure 2. Optimal spacer length for DNA cleavage by TALEN. (a) In vitro cleavage activity. TALENs consist of the TALE repeat units preceded by
an HA tag and an NLS. The C-terminal catalytic domain of the FokI endonuclease (FokI) mediates cleavage after dimerization of two TALEN
subunits. The linear DNA substrate containing an inverted AvrBs4 repeat was incubated with TALENs in vitro and the extent of cleavage analyzed
by agarose gel electrophoresis. The respective spacer lengths (left) and the positions of the cleavage products (arrows) are indicated. (b) Cleavage
activity in cellula. HEK293T cells were co-transfected with the respective TALEN expression plasmids and reporter plasmids that harbor an inverted
repeat of AvrBs4 integrated into the 5'-end of a gene encoding a destabilized EGFP (dsEGFP). The position of a binding site for I-SceI used for
internal reference is shown. The graph displays reduction of EGFP mean fluorescent intensity (MFI) relative to a non-functional nuclease, as
determined by flow cytometry. The respective spacer lengths separating the AvrBs4 sites are indicated. (c) Relative TALEN activity in relation to
spacer length. The graph displays TALEN activity (from [b]) relative to I-SceI at the various spacers. (d) TALEN expression levels. Transfected
HEK293T cells were harvested after 48 h and cell lysates probed either with antibodies against HA tag or β-actin. Lysate of non-transfected cells is
marked with

Together with the transcriptional reporter data, these 
results suggest that the structural requirements for 
TALEN-mediated DNA cleavage are divergent from 
dTALE-mediated gene regulation, i.e. while the dTALE 
variant based on A4-BB activated luciferase expression, 
the corresponding TALEN did not efficiently cut the 
DNA target. Furthermore, variants with a short 
C-terminal linker peptide that connects the TALE 
repeats with the FokI cleavage domain, such as the 17-
and 47-residue linkers in A4-NH and A4-NC, respectively, 
provide a better scaffold to generate designer TALENs. 



Profiling TALEN activity versus cytotoxicity 

To characterize the target site requirements for efficient 
cleavage by a heterodimeric TALEN in a chromosomal 
context, a HEK293-based reporter cell line with an 
integrated dsEGFP cassette was generated. The dsEGFP 
ORF contained distinct recognition sites for the TAL effectors
AvrBs3 (EBE_AvrBs3) and AvrBs4 (EBE_AvrBs4) in
opposite orientation (Figure 3a). In order to maintain
the ORF and keep an optimal distance between the
EBEs, a 13-bp spacer was chosen. Again, a recognition
site for I-SceI was included as an internal control.
Transfection of these cell lines with expression plasmids
coding for I-SceI or an EGFP-specific ZFN pair (17)
induced gene disruption in ~22% of reporter cells,
as determined by flow cytometry. Gene disruption by
co-expression of TALEN subunits A3-NH and A4-NH
was induced in >30% of the transfected cells. Of note,
TALENs that contain an obligate heterodimeric FokI
domain (19), as compared to the wild-type FokI version
used in this experiment, were slightly less active
(Supplementary Figure S2). For both TALENs and
ZFNs, gene disruption activity was strictly dependent on
the presence of both nuclease subunits.

Figure 3. Comparison of activity versus toxicity profiles of TALEN
and ZFN. (a) Designer nuclease-mediated gene disruption. The
HEK293-based reporter cells harbor an integrated dsEGFP gene that
contains an inverted heterodimeric AvrBs4/AvrBs3 target sequence

The extent of TALEN-mediated target site cleavage in 
these reporter cells was subsequently confirmed by 
genotyping (Figure 3b). A genomic fragment encompassing
the TALE target sites EBE_AvrBs3 and EBE_AvrBs4 was
amplified by PCR and subjected to digestion with XhoI.
The XhoI restriction site is located in the spacer sequence
separating the two target half-sites, which is expected to be
cleaved by the TALEN. Given that NHEJ-mediated DNA
repair after a TALEN-induced DSB will frequently lead to
disruption of the target site, the corresponding PCR
amplicons will lack the XhoI recognition site and therefore
become resistant to XhoI cleavage. The fraction of
XhoI-resistant PCR products reflecting the cleavage
activity was in good agreement with the percentage of 
EGFP-negative cells (Figure 3a), confirming high 
activity of the A3-NH/A4-NH TALEN pair. 

To compare directly the activities and 
nuclease-associated toxicities of TALENs and ZFNs, the 
same reporter cell line was transfected with increasing 
amounts of expression vectors ranging from 1 to 600 ng 
of each subunit (Figure 3c). Note that while the TALEN 
pair was equipped with a wild-type FokI domain, the
ZFNs contain an obligate heterodimeric cleavage 
domain that was shown to reduce nuclease-associated 
toxicity (19,21). The activity profiles of both classes of 
nucleases were comparable for any of the DNA amounts 
transfected and reached a maximal gene disruption 
activity at about 45% EGFP-negative cells. However, as- 
sessment of cell survival 5 days after transfection revealed 
a significant increase in cytotoxicity for ZFN transfected 
cells at high vector doses. 



Figure 3. Continued 

separated by a 13-bp spacer in the 5'-end of the open reading frame. The
position of the diagnostic XhoI site is indicated. For internal reference,
a binding site for I-SceI was placed downstream of the TALEN target
site. The graph shows the percentage of EGFP-negative cells 5 days
after transfection with the nuclease expression vectors (TALEN, ZFN
or I-SceI). (b) Molecular characterization. Genomic DNA was extracted
5 days after transfection and PCR amplicons encompassing
the target sites were used as templates for digestion with XhoI. An
arrow indicates the position of the XhoI-resistant DNA fragment.
The numbers below designate the percentages of XhoI-resistant PCR
fragments (note background level of ~6%). (c) Activity versus toxicity.
The EGFP reporter cells were transfected with increasing amounts
(1-600 ng) of nuclease expression vectors (TALEN, ZFN or I-SceI)
and an mCherry-encoding plasmid. The percentage of EGFP and
mCherry-positive cells was determined by flow cytometry after 2 and
5 days. The graphs display gene disruption activities (EGFP-negative
cells at Day 2; top) and nuclease-associated cytotoxicities (fraction of
mCherry-positive cells at Day 5 as compared to Day 2 after transfection;
bottom), relative to cells transfected with a mock plasmid.
Statistically significant differences in toxicities between TALEN and
ZFN are indicated by *P<0.05 or **P<0.01, respectively.

Together with the previous data, these results suggest
that the optimal spacer length for TALENs based on the
A4-NH scaffold is 12 or 13 bp, while the A4-NC architecture,
which contains a longer linker between the DNA
binding domain and the C-terminal FokI domain, was
more flexible with respect to spacer length. In summary, 
our data demonstrate that TALEN-mediated gene disrup- 
tion at chromosomal loci can be as efficient as knockouts 
created by ZFNs but with reduced nuclease-associated 
toxicity. 
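
The spacer preferences summarized above lend themselves to a simple design check. The sketch below measures the spacer between a left half-site on the top strand and a right half-site on the bottom strand, and tests it against the ranges reported here (12-13 bp for the NH scaffold; approximately 12-21 bp for the NC scaffold in the episomal reporter assay). The function names and the dictionary are hypothetical, and the ranges are a reading of the reporter data, not a validated design rule.

```python
# Spacer ranges (bp) read off the reporter data in this section.
# Assumption: they are treated as hard limits for illustration only.
SPACER_RANGE = {"NH": (12, 13), "NC": (12, 21)}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def spacer_length(locus, left_site, right_site):
    """Spacer (bp) between two half-sites in inverted orientation.

    left_site is searched on the top strand; right_site is given 5'->3'
    as bound on the bottom strand. Returns None if either is not found.
    """
    locus = locus.upper()
    start = locus.find(left_site.upper())
    if start < 0:
        return None
    left_end = start + len(left_site)
    right_start = locus.find(revcomp(right_site.upper()), left_end)
    if right_start < 0:
        return None
    return right_start - left_end


def scaffold_fits(scaffold, spacer):
    """True if the spacer lies within the range noted for the scaffold."""
    low, high = SPACER_RANGE[scaffold]
    return spacer is not None and low <= spacer <= high
```

A 15-bp spacer, as used for the endogenous targets in the next section, would pass this check for the NC scaffold but not for NH.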

Efficient disruption of endogenous genes 

Based on the above findings, we aimed at generating 
TALEN pairs that target endogenous genes in the 
human genome. To this end, we designed TALENs to 
target sites in the CCR5 and IL2RG loci (Figures 4 
and 5) that overlap with previously published ZFN recog- 
nition sites (20,40). The corresponding TALE repeat 
domains were cloned into both the A4-NC (47-residue 
linker) and A4-NH (17-residue linker) scaffolds and 
termed GC-NC or GC-NH (targeted to the IL2RG 
locus) and C5-NC or C5-NH (targeted to the CCR5 
locus). Note that while the ZFN half-sites are separated 
by 5 bp, the TALEN target sites contain 15-bp spacers. 

For quantitative comparison of the activities and
toxicities of the IL2RG-specific TALEN (designated
'GC' for gamma chain) and ZFN, an IL2RG-dsEGFP
reporter construct was generated (Figure 4b). A recognition
site for I-SceI was placed in between the two ORFs as
a reference. Co-transfection of the reporter with I-SceI,
the IL2RG-specific TALENs or ZFN expression vectors,
respectively, reduced the MFI of IL2RG-dsEGFP
by between 69% and 90%. This demonstrated that all of
the tested nucleases were highly active, although the
TALEN variant GC-NC with the longer linker was
somewhat more active than GC-NH. Concomitant assessment
of cytotoxicity, on the other hand, revealed an
almost 2-fold increase in cell survival when comparing
the GC-NC TALEN with the ZFN pair.
To assess activity of GC-NC at the genome level,
HEK293T cells were transfected with the respective TALEN
or ZFN expression vectors (Figure 4c). The extent of
NHEJ-induced insertions/deletions after target site
cleavage was quantified using the mismatch-sensitive
T7E1 endonuclease (41). A direct comparison indicated
that the engineered IL2RG-specific TALEN pair was
about half as active as the well-established ZFN, with
targeted allelic modification frequencies of 14% for the
TALEN and 37% for the ZFN. A CCR5-specific
TALEN served as a negative control at the IL2RG locus.

Then the TALENs designed to target the CCR5 locus,
C5-NC and C5-NH (Figure 5a), were transfected into
HEK293T cells. Genomic DNA was isolated and subjected to
T7E1-based genotyping. A side-by-side comparison
indicated that the engineered C5-NC TALEN was as
active as the well-established ZFN, with allelic modification
frequencies of 17 and 14%, respectively (Figure 5b). As seen
before for the IL2RG-specific TALENs, the C5-NH
TALEN with the shorter linker was not as active as
C5-NC.

Figure 4. Modifications at the human IL2RG locus. (a) Target sites in
human IL2RG gene. Target sites are highlighted by gray shaded
(TALEN) or black (ZFN) boxes, respectively. The RVDs of the engineered
TALEs as well as the expected target sequences are indicated.
(b) Comparison of the activities and toxicities of IL2RG-specific
designer nucleases. HEK293T cells were co-transfected with expression
plasmids encoding IL2RG-specific TALEN and ZFN, respectively, and
a reporter plasmid harboring an IL2RG-dsEGFP fusion gene. The
graphs display reduction of the mean fluorescent intensity (MFI) of
IL2RG-dsEGFP (left) and relative cell survival as compared to cells
expressing a nonfunctional nuclease indicated with '-' (right). NH and
NC designate the TALEN scaffold used. Statistically significant
differences in activity and toxicity between the TALEN and ZFN
tested are indicated by *(P<0.05) or **(P<0.01), respectively. (c)
Disruption of endogenous human IL2RG locus. After transfection
with TALEN (NC scaffold) and ZFN expression vectors, genomic
DNA was extracted and used as a template for PCR amplification.
Amplicons encompassing the target sites were digested with the
mismatch-sensitive T7 endonuclease I (T7E1). Arrows indicate
the expected positions of the T7E1 digestion products. Numbers at
the bottom designate the average percentage of modified alleles
(n = 2). TALENs targeting the CCR5 locus (C5-TALEN) were used
as negative controls. L and R refer to the nuclease subunits, binding
to the left or right target half-sites, respectively.



Specificity of designer nucleases is an important parameter. 
A major off-target locus of the CCR5-specific ZFN 
has been identified in CCR2 (40,42), which shares a high 
degree of sequence identity with the CCR5 locus 
(Figure 5c). The ZFN target site in CCR5 differs from 
the corresponding site in CCR2 in two positions, one in 
each target half-site. In contrast, the entire 19-bp target 
sequence of the left TALEN subunit is also found in 
CCR2, while the right target half-site varies at only one 
position bound by a TALE repeat unit and an additional 
one recognized by the 0th repeat (Figure 5a). T7E1-based 
genotyping at the CCR2 locus revealed that expression of 
the CCR5-specific ZFN induced mutations at 11% of 
CCR2 alleles, while only 1% of CCR2 alleles were 
mutated after TALEN expression. Parallel assessment of 
cytotoxicity revealed a 2-fold increase in cell survival when 
comparing the CCR5-specific TALEN with the ZFN pair 
(Figure 5d). 

Figure 5. Modifications at the human CCR5 locus. (a) Target sites in 
the human CCR5 gene. Target sites are highlighted by gray shaded 
(TALEN) or black (ZFN) boxes, respectively. The RVDs of the 
engineered TALEs as well as the expected target sequences are shown. Bold 
letters designate the nucleotides not conserved in the CCR2 locus. 
(b) Disruption of endogenous human CCR5 locus. After transfection 
with TALEN and ZFN expression vectors, genomic DNA was 
extracted and PCR amplicons encompassing the CCR5 target sites were 
digested with T7E1. Arrows point out the expected positions of the 
T7E1 digestion products. Numbers at the bottom indicate the 
average percentage of modified alleles (n = 3). TALENs targeting 
the IL2RG locus (GC-TALEN) were used as a negative control. 
NH and NC designate the TALEN scaffold. (c) Off-target activity at 
the CCR2 locus. Asterisks in the alignment of CCR5 and CCR2 designate 
the mismatches between the two target sequences, bold letters the 
nucleotides affecting binding of the designer nucleases. PCR amplicons 
encompassing the CCR2 target locus were digested with T7E1. Arrows 
point out the expected positions of the T7E1 digestion products. 
Numbers at the bottom designate the average percentage of modified 
alleles (n = 2). (d) Cytotoxicity of CCR5-specific designer nucleases. 
The graph displays nuclease-associated cytotoxicities relative to cells 
expressing a nonfunctional nuclease (-). Statistically significant 
differences in toxicities between TALENs and ZFN are indicated by 
**P<0.01. 



DISCUSSION 

Designer nucleases have evolved into invaluable tools for 
targeted genome engineering. In particular, ZFNs have 
been successfully employed in a broad variety of 
research fields, ranging from fundamental to applied 
science. A major drawback of ZFNs, however, is the elab- 
orate and time-consuming experimental selection process 
(1,17). Although simplified methods, such as modular 
assembly and CoDA have been reported (18,43,44), the 
quality of ZFNs generated by such platforms is contro- 
versial (45-47) or not determined yet. In this study, we 
have characterized the cleavage parameters for TALENs 
containing 17.5 repeats, based on the AvrBs4 
scaffold. This TALEN scaffold mediates binding to a 
total of 19 bp, including the invariant thymine (in 
position -1) that precedes the RVD-defined nucleotides 
in the TALE target box (24). By employing different 
in vitro and cellular reporter assays, we have defined a 
TALEN architecture that allows for efficient DNA 
cleavage. Depending on the length of the linker that 
connects the TALE DNA binding domain with the FokI 
cleavage domain, we found that spacers of 12-15 bp 
between the two target half-sites are optimal for high 
TALEN cleavage activity. 
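
The target-site geometry described here (two 19-bp half-sites, each 
beginning with the invariant T, separated by a 12-15 bp spacer) can be 
illustrated with a minimal Python sketch. The function and type names are 
hypothetical and were not part of this study. The sketch also assumes, as 
is conventional for nuclease pairs, that the right subunit binds the 
opposite strand, so that its 5' T appears as an A at the 3' end of the 
right half-site when read on the top strand. 

```python
from typing import Iterator, NamedTuple

HALF_SITE_LEN = 19            # invariant 5' T plus 18 RVD-defined nucleotides
SPACER_RANGE = range(12, 16)  # 12-15 bp, the spacer range reported as optimal


class CandidateSite(NamedTuple):
    start: int   # 0-based position of the left half-site on the top strand
    left: str    # left half-site, read 5'->3' on the top strand
    spacer: str
    right: str   # right half-site, as it appears on the top strand


def find_candidate_sites(seq: str) -> Iterator[CandidateSite]:
    """Yield every window that fits the half-site/spacer geometry."""
    seq = seq.upper()
    for start in range(len(seq)):
        if seq[start] != "T":          # left subunit needs T at position -1
            continue
        for spacer_len in SPACER_RANGE:
            end = start + 2 * HALF_SITE_LEN + spacer_len
            if end > len(seq):
                break
            # Assumption: right subunit binds the bottom strand, so its
            # invariant T is seen as an A on the top strand.
            if seq[end - 1] != "A":
                continue
            yield CandidateSite(
                start=start,
                left=seq[start:start + HALF_SITE_LEN],
                spacer=seq[start + HALF_SITE_LEN:end - HALF_SITE_LEN],
                right=seq[end - HALF_SITE_LEN:end],
            )
```

Such a scan only enumerates positions compatible with the architecture; 
it says nothing about the activity or specificity of a resulting pair, 
which must be determined experimentally. 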

Similar to our approach, Miller et al. (27) produced a 
set of TALENs with truncated TALE domains. Their 
variants with 28- or 63-residue-long C-terminal linkers 
proved to be the most active configurations and are 
similar to our A4-NH and A4-NC scaffolds with 17- 
and 47-residue linkers, respectively. In the context of 
TALE transcription factors, Zhang et al. (28) showed 
that 147 residues N-terminally of the TALE repeat units 
are essential for efficient DNA binding. This is in good 
agreement with our A4-NH and A4-NC scaffolds (+153 
amino acids) as well as the architecture used by Miller 
et al. (27). Although we barely detected activity of our 
longer TALEN variant A4-BB, TALENs similar to this 
scaffold have been used in other studies (32,33). Given 
that the respective TALEN activities in these reports 
were determined using different methods, in different or- 
ganisms, and at target sites with longer spacers that may 
accommodate the longer linkers, it is difficult to compare 
the results directly. Nonetheless, our data imply that a 
very long stretch of residues between the TALE repeat 
units and the FokI nuclease domain may place the catalytic 
center in an unfavorable position that impairs 
activity. Alternatively, the lower nuclease activity of the 
longer TALEN variants may simply be a result of decreased 
protein stability, as shown in our immunoblots. 

Some of the aforementioned studies also explored in 
detail the spacer length requirements between the two 
target half-sites. In accordance with studies performed 
on ZFNs (48,49), the protein linker that connects the 
DNA binding domain with the nuclease domain influences 
selectivity with regard to DNA spacer length and hence 
ultimately specificity. While TALENs with long linkers 
(>200 amino acids) seem to work on a very broad range 
of spacers spanning 14-40 bp (29,30), the shorter linkers 
(17-63 residues) used in our study and in Miller et al. (27) 
restrict activity to spacers of 12-22 bp. Nonetheless, these 
results are in sharp contrast to ZFNs, where activity can be 
restricted to 5/6-bp spacers using a short 4-residue linker 
(49), so more effort must be invested in optimizing the 
TALEN architecture to further restrict the activity 
range of these nucleases in the future. 

The design process to generate TALENs is based on a 
simple modular assembly strategy. As opposed to ZFNs, 
in which an individual zinc-finger module contacts, to a 
first approximation, a 3-bp target subsite (14), each 
TALE repeat unit recognizes a single nucleotide (23,24). 
We have used the four RVDs NN to target guanine, NI 
for adenine, NG for thymine and HD for cytosine. Some 
recent reports showed a stronger preference of the NK 
repeat to bind guanine than NN (26,27), so it will be inter- 
esting to see whether exchanging the respective RVDs will 
increase activity and further reduce toxicity of designer 
TALENs. 
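
The one-repeat-per-nucleotide code with the four RVDs used here can be 
written down as a simple lookup. The following Python sketch is 
illustrative only; the function name is hypothetical, and the check for a 
leading T reflects the invariant thymine at position -1, which is bound by 
the 0th repeat rather than by an RVD. 

```python
from typing import List

# RVD assignments as used in this study.
RVD_FOR_BASE = {"G": "NN", "A": "NI", "T": "NG", "C": "HD"}


def rvds_for_half_site(half_site: str) -> List[str]:
    """Return the RVD array for a half-site written 5'->3'.

    The first nucleotide must be the invariant T (position -1); it is not
    encoded by an RVD, so the returned list has len(half_site) - 1 entries.
    """
    site = half_site.upper()
    if not site or site[0] != "T":
        raise ValueError("half-site must start with the invariant T")
    try:
        return [RVD_FOR_BASE[base] for base in site[1:]]
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]}") from None
```

For a 19-bp half-site, the function returns 18 RVDs, corresponding to the 
17.5 repeats of the scaffold. 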

Overall, the success rate for generating TALENs by 
simple modular assembly seems very high. We have 
used the natural TALEs AvrBs4 and AvrBs3 in the 
context of a TALEN pair to disrupt an EGFP marker 
and designed two additional TALEN pairs that recognize 
two endogenous target sites in the human genome. 
These nucleases induced modifications in 14 or 17% of the 
targeted IL2RG or CCR5 alleles, respectively, or 45% of 
the EGFP reporter. These numbers are comparable to 
data reported by Miller et al. (27) who generated 
TALENs based on a different TALE backbone and 
different numbers of TALE repeats. The authors 
targeted the human CCR5 and NTF3 genes with 
efficiencies of up to 27% and report that all of their 
TALEN pairs designed to recognize sites with a 
12-21 bp spacer yielded at least 5% gene editing (27). 
Furthermore, Cermak et al. (32) described design guide- 
lines based on natural TALEs that may further improve 
the success frequency. 

To enable a side-by-side analysis with ZFNs, the en- 
dogenous target sites in CCR5 and IL2RG were chosen 
to overlap with binding sites of previously described ZFNs 
(20,40). On the whole, our designed TALENs showed 
gene disruption activities similar to those of some extensively 
optimized ZFNs, such as the CCR5-specific ZFN pair 
that has been employed in two clinical trials 
(NCT00842634, NCT01044654). On the other hand, the 
TALENs used here were generally less cytotoxic than their 
ZFN counterparts, suggesting better specificity. In fact, in 
a side-by-side analysis of the CCR5-specific designer 
nucleases, the TALEN showed significantly reduced 
off-target activity at the CCR2 locus as compared to the 
ZFN. Remarkably, the CCR5-specific TALEN induced 
mutations at 17% of CCR5 alleles and at only 1% of 
the highly homologous CCR2 locus. In contrast, activity 
of the CCR5-specific ZFN was almost comparable at the 
two loci, with mutation frequencies of 14% at CCR5 and 
11% at CCR2. For both ZFN and TALEN the respective 
target sites differ at two positions each: 2 out of 24 nt for 
the ZFN and 2 out of 38 nt for the TALEN pair. One of 
the two mismatches between CCR5 and CCR2 coincides 
with the 5' terminal T of the right TALE binding box 
(Figure 5a). Previous studies have shown that this T in 
position -1 of the recognition site pairs with the 
postulated 0th TALE repeat and is critical for the 
TALE-DNA interaction (24,50). Although further 
studies are needed, it seems that the conserved 5' T 
nucleotide of the EBE sites is particularly well-suited to 
discriminate two nearly identical target sites. 
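
The difference in discrimination between the two loci can be made 
explicit with a short calculation based on the frequencies reported 
above. The ratio used in this Python sketch is a simple illustrative 
metric, not a measure defined in this study, and the function name is 
hypothetical. 

```python
def on_off_ratio(on_target_pct: float, off_target_pct: float) -> float:
    """Ratio of modified alleles at the intended locus to the off-target locus."""
    return on_target_pct / off_target_pct


# Modified alleles (%) at CCR5 (on-target) and CCR2 (off-target).
talen_ratio = on_off_ratio(17, 1)    # 17.0
zfn_ratio = on_off_ratio(14, 11)     # about 1.27

# Fraction of mismatched positions between the CCR5 and CCR2 target sites.
zfn_mismatch_fraction = 2 / 24       # about 0.083
talen_mismatch_fraction = 2 / 38     # about 0.053
```

By this measure, the TALEN pair discriminated between CCR5 and CCR2 more 
than ten times better than the ZFN pair, although a smaller fraction of 
its target positions differs between the two loci. 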

As mentioned above, all TALEN pairs used in this 
study recognize a 38-bp target site while the binding 
sites for the corresponding ZFNs were 18 or 24 bp long. 
While much more work will be required to come to defini- 
tive conclusions, it is tempting to speculate that the gen- 
erally longer recognition sites of TALENs as compared to 
ZFNs go along with higher specificities and therefore less 
toxicity. 

Depending on the specific needs, other functional 
domains can be fused to the TALE repeat units to 
create artificial proteins able to modify not only the 
genome, but also the epigenome or transcriptome in a 
targeted fashion. The high success rates of the TALE 
modular assembly strategy to produce functional 
designer nucleases or artificial transcription factors 
(26-28,32,33,51) suggest that context-dependent effects 
between individual TALE repeat units are negligible. 
The high number of repeat units with their high degree 
of homology does complicate the assembly of such DNA 
binding domains when using standard cloning 
approaches. However, recently introduced Golden 
Gate-based strategies overcome these limitations (28,32,33,51). 
Given the high interest in sequence-specific genome 
surgery, it is conceivable that off-the-shelf TALENs for 
each human gene will be available soon. 

In conclusion, the TALEN scaffold presented here 
enables genome editing with high efficiency and precision. 
A side-by-side comparison between our TALENs and 
well-characterized ZFNs showed that the TALE 
platform enables the design of artificial nucleases 
that are as effective as the ZFNs in terms of activity 
but likely more specific and less cytotoxic. Although 
further characterization with regard to specificity will 
be necessary for clinical applications of TALENs, the 
simple 1:1 code, i.e. one TALE repeat unit recognizing 
1 nt, will greatly facilitate the design of customized DNA 
binding domains for basic and applied sciences in the 
future. 